Adaptive Estimation of the Optimal ROC Curve and a Bipartite Ranking Algorithm

Authors

  • Stéphan Clémençon
  • Nicolas Vayatis
Abstract

In this paper, we propose an adaptive algorithm for bipartite ranking and prove its statistical performance in a stronger sense than the AUC criterion. Our procedure builds on the RankOver algorithm proposed in (Clémençon & Vayatis, 2008a). The algorithm outputs a piecewise constant scoring rule which is obtained by overlaying a finite collection of classifiers. Here, each of these classifiers is the empirical solution of a specific minimum-volume set (MV-set) estimation problem. The main novelty arises from the fact that the levels of the MV-sets to recover are chosen adaptively from the data to adjust to the variability of the target curve. The ROC curve of the estimated scoring rule may be interpreted as an adaptive spline approximant of the optimal ROC curve. Error bounds for the estimate of the optimal ROC curve in terms of the L∞-distance are also provided.
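
The overlay construction described in the abstract can be illustrated with a minimal sketch. It is a simplified stand-in, not the authors' procedure: it assumes one-dimensional data, uses threshold rules `{x > t}` in place of empirical MV-set solutions, and takes a fixed grid of levels rather than levels chosen adaptively from the data. The function names (`threshold_for_level`, `overlay_score`, `rate_above`) and the simulated Gaussian samples are hypothetical, introduced only for illustration.

```python
import random

def threshold_for_level(negatives, alpha):
    """Return a threshold t such that the rule {x > t} has an empirical
    false positive rate of at most alpha (0 < alpha < 1) on `negatives`."""
    ordered = sorted(negatives)
    n = len(ordered)
    allowed = int(alpha * n)          # negatives allowed above the threshold
    return ordered[n - allowed - 1]   # at most `allowed` values lie strictly above

def overlay_score(x, thresholds):
    """Piecewise constant score: the number of overlaid classifiers
    (here, threshold rules) that label x as positive."""
    return sum(1 for t in thresholds if x > t)

def rate_above(sample, t):
    """Fraction of `sample` strictly above t (empirical FPR or TPR)."""
    return sum(1 for x in sample if x > t) / len(sample)

# Assumed toy data: negatives ~ N(0, 1), positives ~ N(1.5, 1).
random.seed(0)
negatives = [random.gauss(0.0, 1.0) for _ in range(500)]
positives = [random.gauss(1.5, 1.0) for _ in range(500)]

# Fixed grid of false positive levels (the paper chooses these adaptively).
levels = [0.1, 0.25, 0.5, 0.75]
thresholds = [threshold_for_level(negatives, a) for a in levels]

# One empirical ROC point (FPR, TPR) per overlaid classifier.
roc_points = [(rate_above(negatives, t), rate_above(positives, t))
              for t in thresholds]

for alpha, (fpr, tpr) in zip(levels, roc_points):
    print(f"level {alpha:.2f}: FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

Because the thresholds decrease as the level grows, the positive regions are nested, and the score takes only the values 0 through `len(levels)`. Its ROC curve passes through the points in `roc_points`, with linear interpolation between them under the usual convention, which is the sense in which it acts as a piecewise approximant of the target curve.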

Similar resources

Bipartite Ranking: a Risk-Theoretic Perspective

We present a systematic study of the bipartite ranking problem, with the aim of explicating its connections to the class-probability estimation problem. Our study focuses on the properties of the statistical risk for bipartite ranking with general losses, which is closely related to a generalised notion of the area under the ROC curve: we establish alternate representations of this risk, relate...

A tree-based ranking algorithm and approximation of the optimal ROC curve

Recursive partitioning methods are among the most popular techniques in machine learning. It is the purpose of this paper to investigate how such an appealing methodology may be adapted to the bipartite ranking problem. In ranking, the goal pursued is global: the matter is to learn how to define an order on the whole feature space X, so that positive instances take up the top ranks with maximu...

Bipartite ranking: risk, optimality, and equivalences

We present a systematic study of the bipartite ranking problem, with the aim of delineating its connections to the class-probability estimation problem. Our study focuses on the properties of the statistical risk for bipartite ranking, which is closely related to the area under the ROC curve: we establish alternate representations of the risk, relate the Bayes-optimal risk to a class of probabi...

Overlaying classifiers: a practical approach to optimal scoring

The ROC curve is one of the most widely used visual tools for evaluating the performance of scoring functions with regard to their capacity to discriminate between two populations. It is the goal of this paper to propose a statistical learning method for constructing a scoring function with nearly optimal ROC curve. In this bipartite setup, the target is known to be the regression function up to an increa...

Anomaly Ranking as Supervised Bipartite Ranking

The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with an MV curve as low as possible at any point. In this paper, it is proved th...

Publication date: 2009